[CI] Adding continuous testing for ECS dynamic templates #97901

Merged
19 commits merged on Sep 11, 2023

Conversation

eyalkoren
Contributor

@eyalkoren eyalkoren commented Jul 24, 2023

Closes #96713

All three test cases are covered:

  • a test document containing all ECS fields with random values in flattened form
  • a test document containing all ECS fields with random values in flattened form, indexed into an index configured with subobjects: false
  • a test document containing all ECS fields with random values in nested/object form
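The difference between the flattened and nested/object document shapes can be sketched as follows. This is only an illustration in Python, not the PR's test code; the helper names `flatten` and `nest` and the sample document are hypothetical.

```python
def flatten(doc, prefix=""):
    """Collapse nested objects into dotted keys, e.g. {"host": {"name": "a"}} -> {"host.name": "a"}."""
    flat = {}
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into sub-objects, carrying the dotted path along.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def nest(flat):
    """Expand dotted keys back into nested objects."""
    nested = {}
    for path, value in flat.items():
        node = nested
        *parents, leaf = path.split(".")
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


# Hypothetical sample document; the real tests use all ECS fields with random values.
nested_doc = {"host": {"name": "web-1", "ip": "10.0.0.1"}, "message": "hello"}
flat_doc = flatten(nested_doc)
```

Both shapes carry the same field paths, so the same dynamic templates are expected to produce the same mappings for either form.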

In addition, I added verification that all ECS multi-field definitions are covered by the ECS dynamic templates (revealing two that actually were not).

@felixbarny @ruflin please check whether you agree with my comment about not failing when we create a multi-field mapping for fields whose ECS definition does not require one.
For example, we define a multi-field mapping for the *.name pattern. If you look at the ECS definitions, you will find that many *.name fields have multi-field mappings, while many others don't. Trying to be fully accurate would result in many more dynamic templates.
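The one-directional check described here (fail when an ECS multi-field is missing, tolerate extra ones) might look roughly like the sketch below. It is written in Python for brevity and is not the PR's implementation; `NAME_TEMPLATE`, `ECS_MULTI_FIELDS`, and both functions are hypothetical, and `fnmatchcase` only approximates how Elasticsearch evaluates `path_match`.

```python
from fnmatch import fnmatchcase

# Hypothetical dynamic template, modeled on the "*.name" pattern discussed above.
NAME_TEMPLATE = {
    "path_match": "*.name",
    "mapping": {
        "type": "keyword",
        "fields": {"text": {"type": "match_only_text"}},
    },
}

# Made-up excerpt of ECS definitions: field path -> multi-fields that ECS requires.
ECS_MULTI_FIELDS = {
    "user.name": {"text": "match_only_text"},
    "host.name": {},  # keyword only in this excerpt
}


def multi_fields_from_template(path, template):
    """Return the multi-fields the template would create for this field path."""
    if not fnmatchcase(path, template["path_match"]):
        return {}
    fields = template["mapping"].get("fields", {})
    return {name: spec["type"] for name, spec in fields.items()}


def missing_multi_fields(ecs_fields, template):
    """Return ECS multi-fields the template fails to create; extra ones are tolerated."""
    missing = {}
    for path, expected in ecs_fields.items():
        actual = multi_fields_from_template(path, template)
        gaps = {n: t for n, t in expected.items() if actual.get(n) != t}
        if gaps:
            missing[path] = gaps
    return missing
```

With this shape, `host.name` receives a `text` sub-field that the excerpt does not ask for, yet the check still passes, which is the lenient behaviour proposed in the comment.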

@P1llus I assigned you as a reviewer because I think you were waiting for this in order to start migrating some ECS mappings to rely on the builtin dynamic templates.

@elasticsearchmachine elasticsearchmachine added v8.10.0 external-contributor Pull request authored by a developer outside the Elasticsearch team labels Jul 24, 2023
@eyalkoren eyalkoren requested review from felixbarny and P1llus July 27, 2023 15:55
@eyalkoren eyalkoren self-assigned this Jul 27, 2023
@eyalkoren eyalkoren added :Delivery/Build Build or test infrastructure :Data Management/Data streams Data streams and their lifecycles >test Issues or PRs that are addressing/adding tests labels Jul 27, 2023
@eyalkoren eyalkoren marked this pull request as ready for review July 27, 2023 15:58
@elasticsearchmachine elasticsearchmachine added Team:Data Management Meta label for data/management team Team:Delivery Meta label for Delivery team labels Jul 27, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-delivery (Team:Delivery)

@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

Member

@felixbarny felixbarny left a comment

@felixbarny @ruflin please check whether you agree with my comment about not failing when we create a multi-field mapping for fields whose ECS definition does not require one.
For example, we define a multi-field mapping for the *.name pattern. If you look at the ECS definitions, you will find that many *.name fields have multi-field mappings, while many others don't. Trying to be fully accurate would result in many more dynamic templates.

It seems we're erring on the side of always creating a match_only_text subfield, even in cases where ECS only defines a keyword field for *.name fields.

I think that's fine. While on the one hand, this creates a bit more indexing overhead than strictly required, it brings a nice consistency via the naming convention *.name so that users know they can always do a full text search on fields ending with *.name.

@ruflin
Contributor

ruflin commented Jul 31, 2023

I think that's fine. While on the one hand, this creates a bit more indexing overhead than strictly required, it brings a nice consistency via the naming convention *.name so that users know they can always do a full text search on fields ending with *.name.

I'm not a fan of the ECS multi-fields, but I agree we should be consistent. Do we have an understanding of how much the overhead on storage is? I guess it's hard to tell? Assuming a user would want to disable this, I assume they could overwrite it in @custom and only have the keyword?

@eyalkoren
Contributor Author

Do we have an understanding of how much the overhead on storage is? I guess it's hard to tell?

I can't say anything clever about the actual overhead, only that, since the default mapping for strings in Elasticsearch is already a multi-field, I assume it is at least acceptable. We are only extending ECS a bit here 🙂

Assuming a user would want to disable this, I assume they could overwrite it in @custom and only have the keyword?

Exactly! And since we added logs@custom before the ECS dynamic templates, any path or pattern coming from it will take precedence.
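A rough sketch of why the ordering matters, assuming (as this comment describes) that dynamic templates are evaluated in order, the first match wins, and the entries from logs@custom end up ahead of the ECS ones. The template names and the `resolve_mapping` helper are hypothetical, and `fnmatchcase` only approximates Elasticsearch's `path_match` evaluation.

```python
from fnmatch import fnmatchcase


def resolve_mapping(path, dynamic_templates):
    """Return (template name, mapping) for the first template whose path_match fits, else None."""
    for entry in dynamic_templates:
        # Each entry is a single-key dict: {template_name: template_body}.
        (name, template), = entry.items()
        if fnmatchcase(path, template["path_match"]):
            return name, template["mapping"]
    return None


# Hypothetical user override placed in logs@custom: keyword only, no text sub-field.
custom = [
    {"name_keyword_only": {"path_match": "*.name", "mapping": {"type": "keyword"}}},
]

# Hypothetical stand-in for the built-in ECS dynamic template.
ecs = [
    {
        "ecs_name_multi_field": {
            "path_match": "*.name",
            "mapping": {
                "type": "keyword",
                "fields": {"text": {"type": "match_only_text"}},
            },
        }
    },
]

# Custom entries come first, so they shadow the ECS template for the same pattern.
combined = custom + ecs
```

Here `resolve_mapping("host.name", combined)` picks the custom keyword-only template, while the ECS template still applies when no override is present.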

@eyalkoren
Contributor Author

@rjernst @mark-vieira would you be able to review or assign someone to review this PR?

Member

@P1llus P1llus left a comment

I think the current review comments cover everything I would have added. I see there was a discussion around multi-fields for .name, which was indeed annoying for me when I tested this earlier. I created an issue in the ECS repo a long time ago, but nothing really came of it.
I don't think the concept of multi-fields itself adds overhead, but text fields themselves certainly add a lot of overhead, so if we had to choose, I would rather not have them be multi-fields, but that decision is not up to me :)
Everything else LGTM.

@breskeby breskeby added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Sep 11, 2023
@breskeby
Contributor

I've added a daily CI job with notifications to Slack (#es-delivery) and email ([email protected])

@breskeby breskeby merged commit c43f83d into elastic:main Sep 11, 2023
@eyalkoren eyalkoren deleted the ecs-test branch September 11, 2023 15:26
Labels
  • auto-merge-without-approval - Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!)
  • :Data Management/Data streams - Data streams and their lifecycles
  • :Delivery/Build - Build or test infrastructure
  • external-contributor - Pull request authored by a developer outside the Elasticsearch team
  • Team:Data Management - Meta label for data/management team
  • Team:Delivery - Meta label for Delivery team
  • >test - Issues or PRs that are addressing/adding tests
  • v8.11.0
Development

Successfully merging this pull request may close these issues.

[CI] Add ECS-mappings compatibility tests
9 participants